
Link resolution #458

Merged
16 commits merged on Nov 14, 2016

Conversation

@mpacer (Member) commented Nov 5, 2016

Closes #11 by passing an intermediate JSON representation through a filter that directly manipulates the underlying pandoc AST, as part of a two-step conversion process.

It also adds a general utility for walking this representation in other cases, based on the https://github.com/jgm/pandocfilters library: a new function (applyJSONFilters) that we mirror until/if jgm/pandocfilters#49 is merged.
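
For context, a pandoc JSON filter action of the kind this PR uses is just a callable with the signature (key, value, format, meta): returning a new node replaces the matched element, and returning None leaves it untouched. A minimal sketch of the mechanism (the action below is a simplified, illustrative stand-in for the real resolve_one_reference added in this PR):

from pandocfilters import applyJSONFilters, RawInline

def section_links_to_refs(key, value, format, meta):
    # Rewrite internal anchors like [text](#sec-id) as LaTeX \ref{}s;
    # returning None for everything else leaves those nodes unchanged.
    if key == 'Link':
        target = value[2][0]   # pandoc >= 1.16: Link is [attr, inlines, [url, title]]
        if target.startswith('#'):
            return RawInline('tex', 'Section \\ref{%s}' % target[1:])

# ast_json is the JSON string produced by a pandoc run with -t json;
# applyJSONFilters walks the AST and applies each action in turn:
# filtered_json = applyJSONFilters([section_links_to_refs], ast_json, format='latex')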

@takluyver

if key == 'Link':
    target = val[2][0]
    # Links to other notebooks
    m = re.match(r'(\d+\-.+)\.ipynb$', target)
takluyver (Member):

The cross-notebook linking only works as part of bookbook, so we should probably take it out of here to avoid confusion.

return RawInline('tex', 'Section \\ref{sec:%s}' % m.group(1))

# Links to sections of this or other notebooks
m = re.match(r'(\d+\-.+\.ipynb)?#(.+)$', target)
takluyver (Member):

Ditto here: the first part of this only makes sense for bookbook, which joins multiple notebooks together.

# mydir = os.path.dirname(os.path.abspath(__file__))
# filter_links = os.path.join(mydir, 'filter_links.py')
# if extra_args is not None:
#     extra_args.extend(['--filter',filter_links])
takluyver (Member):

You probably didn't mean to commit this?

mpacer (Member, Author):

Yes, I meant to delete that; it's left over from the efficiency prototyping.

@@ -64,7 +64,7 @@

% Render markdown
((* block markdowncell scoped *))
- ((( cell.source | citation2latex | strip_files_prefix | convert_pandoc('markdown', 'latex') )))
+ ((( cell.source | citation2latex | strip_files_prefix | convert_pandoc('markdown', 'json',extra_args=[]) | wrapped_convert_link | convert_pandoc('json','latex'))))
takluyver (Member):

Open API question - do we want to have a function which encapsulates these last three steps into one:

  • pandoc to json
  • apply Python filters
  • pandoc from json
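
For illustration, a wrapper around those three steps might look roughly like the sketch below; the function name convert_pandoc_with_filters is hypothetical, while nbconvert's pandoc() helper and pandocfilters.applyJSONFilters are the pieces this PR already relies on.

from nbconvert.utils.pandoc import pandoc
from pandocfilters import applyJSONFilters

def convert_pandoc_with_filters(source, from_format, to_format, actions,
                                extra_args=None):
    # pandoc to JSON, apply the Python filter actions, pandoc from JSON.
    ast_json = pandoc(source, from_format, 'json', extra_args=extra_args)
    filtered = applyJSONFilters(actions, ast_json, format=to_format)
    return pandoc(filtered, 'json', to_format, extra_args=extra_args)

This is essentially what the template pipeline above now spells out inline.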

mpacer (Member, Author):

I was thinking about doing that, but I figured that should be a different PR that demonstrates the generality of this, whereas this is more of a bugfix.

mpacer (Member, Author):

New issue #462 opened to address this.

@@ -109,6 +109,28 @@ def check_pandoc_version():
RuntimeWarning, stacklevel=2)
return ok

try:
    from pandocfilters import applyJSONFilters
except ImportError:
takluyver (Member):

John did a new release of pandocfilters for us, so we can rely on >= 1.4.1 and drop the fallback here.
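
Assuming pandocfilters >= 1.4.1 becomes a hard dependency (for example, declared in setup.py's install_requires), the fallback would reduce to a plain import, roughly:

# requires pandocfilters >= 1.4.1, which ships applyJSONFilters
from pandocfilters import applyJSONFilters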

mpacer (Member, Author):

Sounds good.

@@ -50,6 +50,7 @@
'prevent_list_blocks': filters.prevent_list_blocks,
'get_metadata': filters.get_metadata,
'convert_pandoc': filters.convert_pandoc,
'wrapped_convert_link': wrapped_convert_link,
takluyver (Member):

I'd like the name here to be a bit more descriptive: wrapped is an implementation detail, and it's not clear what it means (wrapped in what?). A descriptive name might be something like pandoc_internal_links_to_latex_refs. That is a bit wordy, so better suggestions are welcome.

I also wonder if it should only be registered for the latex exporter - then the name could be a bit less precise. Adding a filter for an exporter works like this (example from bookbook):

def default_filters(self):
    yield from super(MyHTMLExporter, self).default_filters()
    yield ('markdown2html', markdown2html_custom)

mpacer (Member, Author):

Should be addressed

mpacer (Member, Author) commented Nov 11, 2016:

Hmm, I don't think that this idiom works exactly as-is (also, I think this is not quite how bookbook does it; see https://github.com/takluyver/bookbook/blob/8d7014d8e9775789ce62d092921aa56898da66c0/bookbook/html.py#L29). I'll work on it.

@@ -109,6 +109,22 @@ def check_pandoc_version():
RuntimeWarning, stacklevel=2)
return ok

def applyJSONFilters(actions, source, format=""):
takluyver (Member):

We shouldn't need this function now that it's in pandocfilters.


def resolve_one_reference(key, val, fmt, meta):
    """
    """
takluyver (Member):

Let's either put a docstring in or remove the empty docstring (I'm happy for this to have no docstring).

@@ -43,6 +43,10 @@ def _template_skeleton_path_default(self):
template_extension = Unicode(".tplx").tag(config=True)

output_mimetype = 'text/latex'

def default_filters(self):
    yield from super(TemplateExporter,self).default_filters()
takluyver (Member):

We should be passing in the current class here, i.e. LatexExporter, not TemplateExporter.

Python 3 improved on this, so you can just call super().default_filters() as we do in bookbook. But nbconvert still has to support Python 2 for now.

mpacer (Member, Author):

For some reason, even when running in my Python 3 environment, that doesn't work with the current code.

However, the Python 2 compatible way of doing it does work. Any idea why?

takluyver (Member):

Not sure off the top of my head. If you're interested in working it out, we can try to debug on the Hacktrain tomorrow.

@mpacer (Member, Author) commented Nov 12, 2016

I do not like that this causes a ridiculous slowdown in the tests. I imagine this will negatively impact anyone who is converting a large number of notebooks.

I'm OK merging this, but we should then make it a priority to figure out whether we can do some kind of conditional filtering: for example, adding a piece of metadata to a cell if it contains a link (or even specifically an internal link) as part of a post-processing step when saving the notebook. I'm not sure, but my impression is that this might belong in the https://github.com/jupyter/nbformat repo? Or, if that does no analysis of notebook content, as part of the saving mechanism inside the notebook itself.
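
Purely as an illustration of that conditional-filtering idea (not something this PR implements, and the has_link metadata key is made up), a save-time post-processing step could look something like:

import re

# Crude check for inline markdown link syntax; a real implementation would
# need to handle reference-style links, autolinks, etc.
LINK_RE = re.compile(r'\[[^\]]*\]\([^)]*\)')

def tag_cells_with_links(nb):
    # Record in each markdown cell's metadata whether it contains a link,
    # so an exporter could skip the pandoc round-trip for cells that don't.
    for cell in nb.cells:
        if cell.cell_type == 'markdown':
            cell.metadata['has_link'] = bool(LINK_RE.search(cell.source))
    return nb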

@@ -44,6 +44,10 @@ def _template_skeleton_path_default(self):

output_mimetype = 'text/latex'

def default_filters(self):
    yield from super(LatexExporter, self).default_filters()
takluyver (Member):

Sorry, I forgot when I gave the example that yield from is Python 3 only. To make this work on Python 2, we'll need to expand it:

for f in super(LatexExporter, self).default_filters():
    yield f

@takluyver (Member):

Well, we knew it was going to be slower, right? We can look at ways to mitigate that, like maybe offering an option to skip the link processing entirely.

I don't like the idea of doing stuff on save to support this:

  • It's extra complexity and extra slowness on every save for something which is only relevant for converting notebooks to latex. Most notebooks will never be converted to latex.
  • It requires a change in every tool writing notebooks (e.g. nteract, Pycharm), and they would have to inspect Markdown, which is currently unnecessary.
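
The "option to skip the link processing entirely" mentioned above could, for instance, be a traitlets flag on the exporter; the trait name below is hypothetical, not part of this PR:

from traitlets import Bool
from nbconvert.exporters import LatexExporter

class OptionalLinksLatexExporter(LatexExporter):
    # Hypothetical opt-out: when False, the exporter could fall back to the
    # single-step markdown -> latex conversion and skip link resolution.
    resolve_links = Bool(True,
        help="Rewrite internal notebook links as LaTeX references."
    ).tag(config=True)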
